Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(proxy-cache): set Header Accept-Encoding as default values of vary_headers #12867

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

git-hulk
Copy link
Contributor

@git-hulk git-hulk commented Apr 16, 2024

Summary

Currently, proxy-cache might mix the compression and uncompression values while caching, and return unexpected compression response to the request if users didn't specify the vary_headers.

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

Fix #12796

@CLAassistant
Copy link

CLAassistant commented Apr 16, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions github-actions bot added plugins/proxy-cache schema-change-noteworthy cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee labels Apr 16, 2024
…ary_headers

Currently, proxy-cache might mix the compression and uncompression
values while cacheing, and return runexpected compression response to
the request if users didn't specify the vary_headers.
@git-hulk git-hulk force-pushed the fix/mix-response-in-proxy-cache branch from 1e60671 to 07dcb59 Compare April 16, 2024 13:48
@team-eng-enablement team-eng-enablement added the author/community PRs from the open-source community (not Kong Inc) label Apr 16, 2024
Copy link
Contributor

@StarlightIbuki StarlightIbuki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I will add a changelog for this and approve the PR after it passes the tests

@pull-request-size pull-request-size bot added size/M and removed size/S labels Apr 17, 2024
@git-hulk
Copy link
Contributor Author

@chronolaw @oowl could you help to take a look at this PR?

@@ -72,6 +72,7 @@ return {
elements = { type = "string" },
}},
{ vary_headers = { description = "Relevant headers considered for the cache key. If undefined, none of the headers are taken into consideration.", type = "array",
default = { "Accept-Encoding" },
Copy link
Member

@oowl oowl Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that have introduced breaking change. I am not sure if it is safe for our old version user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @dndx Can you help us to check it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. Guess it shouldn't be regarded as a breaking change since it only affects how the cache key is calculated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't consider fixing a broken behavior a breaking change. ping @dndx

@StarlightIbuki
Copy link
Contributor

StarlightIbuki commented Apr 19, 2024

I did not think it through when I wrote the original comment.

The RFC says:

An origin server might send Vary with a list of fields for two
purposes:

  1. To inform cache recipients that they MUST NOT use this response
    to satisfy a later request unless the later request has the same
    values for the listed fields as the original request (Section 4.1
    of [RFC7234]). In other words, Vary expands the cache key
    required to match a new request to the stored cache entry.

So it's a must to support Vary header to comply with RFC.

I found a todo in the plugin's implementation:

-- TODO handle Vary header

It's better to fix the issue here. Plus this PR could still be useful. If an upstream does not make use of the Vary header correctly, we are supporting it by default (though doubtful, as the upstream may want it to not vary).

Also, it's noteworthy that graphql-proxy-cache-advanced uses a similar way to handle the cache_key.

@git-hulk
Copy link
Contributor Author

git-hulk commented Apr 19, 2024

@StarlightIbuki So do you suggest adding the Vary response header while hitting the cache in this PR?

@StarlightIbuki
Copy link
Contributor

@StarlightIbuki So do you suggest adding the Vary response header while hitting the cache in this PR?

We need to add content from the Vary header to the list of vary_headers before generating the cache key. And to get a constant result we may need to sort the header names.

@git-hulk
Copy link
Contributor Author

git-hulk commented Apr 19, 2024

@StarlightIbuki So do you suggest adding the Vary response header while hitting the cache in this PR?

We need to add content from the Vary header to the list of vary_headers before generating the cache key. And to get a constant result we may need to sort the header names.

@StarlightIbuki Thanks for the clarification. I misunderstood your point. I'll make sure to recapture your point and help to correct me if I'm wrong.

  1. Check if the upstream response has the Vary header, and add it into vary_headers if exists and it has not been added before.
  2. sort the vary_headers and build the cache key, then cache the response.

That said if we have the config.vary_headers = {"A"} and the upstream server sends two responses with Vary: A,B and Vary: B,C, then the config.vary_headers will finally become {A, B, C}. Is it right?

By the way, I'm not sure if it's good to add this in another PR to make the context more clear? I'm also happy to do that.

@StarlightIbuki
Copy link
Contributor

@StarlightIbuki So do you suggest adding the Vary response header while hitting the cache in this PR?

We need to add content from the Vary header to the list of vary_headers before generating the cache key. And to get a constant result we may need to sort the header names.

@StarlightIbuki Thanks for the clarification. I misunderstood your point. I'll make sure to recapture your point and help to correct me if I'm wrong.

  1. Check if the upstream response has the Vary header, and add it into vary_headers if exists and it has not been added before.
  2. sort the vary_headers and build the cache key, then cache the response.

That said if we have the config.vary_headers = {"A"} and the upstream server sends two responses with Vary: A,B and Vary: B,C, then the config.vary_headers will finally become {A, B, C}. Is it right?

By the way, I'm not sure if it's good to add this in another PR to make the context more clear? I'm also happy to do that.

Thanks for offering help. Perhaps a dedicated PR for Vary header support is good.

Note we also need to make sure there are no duplicated headers.

@git-hulk
Copy link
Contributor Author

Note we also need to make sure there are no duplicated headers.

Sure, thanks for your reply. I will take care of this condition. I won't submit an issue to track this enhancement since Kong's issue is only for bug reports.

BTW, I'm not sure how to push this PR forward to reviewing. To see if any maintainers can help.

@StarlightIbuki
Copy link
Contributor

BTW, I'm not sure how to push this PR forward to reviewing. To see if any maintainers can help.

Please understand that we want the changes to Kong to align with the roadmap and the overall design, and it takes time to decide the solution. This PR particularly needs no more effort from your side and I will try to get some attention for this topic.

@StarlightIbuki
Copy link
Contributor

Internally tracked: KAG-4292

@git-hulk
Copy link
Contributor Author

@StarlightIbuki That's great, thanks a lot for your help.

@git-hulk
Copy link
Contributor Author

Hi @StarlightIbuki, want to know if this fix discussion has any progress?

@chronolaw
Copy link
Contributor

chronolaw commented May 31, 2024

I checked the ticket (KAG-4292), it seems that there are still some discussions on it, no decision is out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author/community PRs from the open-source community (not Kong Inc) cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee plugins/proxy-cache schema-change-noteworthy size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in proxy-cache plugin returning compressed cached content to clients not supporting compression
6 participants